31


ABSTRACT 


32 

33 Coronaviruses (CoVs) demonstrate great potential for interspecies transmission, including 

34 zoonotic outbreaks. Although bovine coronavirus (BCoV) strains are frequently circulating in 

35 cattle farms worldwide, causing both enteric and respiratory disease, little is known about their 

36 genomic evolution. We sequenced and analyzed the full-length spike (S) protein gene of thirty-three

37 BCoV strains collected from dairy and feedlot farms in Sweden and Denmark from 2002 to 2010. Amino

38 acid (aa) identities were >97% for the BCoV strains analyzed in this work. These strains formed 

39 a clade together with Italian BCoV strains and were highly similar to human enteric coronavirus

40 HECV-4408/US/94. A high similarity was observed between BCoV, canine respiratory 

41 coronavirus (CRCoV) and human coronavirus OC43 (HCoV-OC43). Molecular clock analysis of 

42 the S gene sequences dated a common ancestor of BCoV and CRCoV to 1951, while a common 

43 ancestor of BCoV and HCoV-OC43 was dated to 1899. BCoV strains showed the lowest 

44 similarity to equine coronavirus (ECoV), placing the date of divergence at the end of the 18th century.

45 Two regions under strong positive selection were detected along the receptor binding subunit of the S

46 protein gene, spanning aa residues 109-131 and 495-527. In contrast, the fusion subunit was

47 observed to be under negative selection. Selection pattern along S glycoprotein implies adaptive 

48 evolution of BCoVs, suggesting a successful mechanism for BCoV to continuously circulate 

49 among cattle and other ruminants without disappearance. 

50 
51


INTRODUCTION 


52 

53 Bovine coronavirus (BCoV) is a member of the Coronaviridae family, order Nidovirales 

54 (Cavanagh, 1997). Coronaviruses (CoVs) possess the largest viral RNA genome in nature.

55 Recently, the International Committee for Taxonomy of Viruses (ICTV) has proposed two sub- 

56 families for Coronaviridae: Coronavirinae and Torovirinae, the former comprising three groups,

57 now renamed as the genera Alphacoronavirus, Betacoronavirus, and Gammacoronavirus, respectively (de

58 Groot et al., 2012) and with a novel (but yet to be approved) genus, provisionally named 

59 Deltacoronavirus (Woo et al., 2012). Four separate lineages (A through D), some of them 

60 encompassing multiple virus species, are commonly recognized within the genus 

61 Betacoronavirus. BCoV, together with human coronavirus OC43 (HCoV-OC43), equine 

62 coronavirus (ECoV) and porcine hemagglutinating encephalomyelitis virus (PHEV) belongs to 

63 the virus species Betacoronavirus 1 of the lineage A of the genus Betacoronavirus (de Groot et 

64 al., 2012). A recently isolated canine respiratory coronavirus (CRCoV) has also shown a high 

65 genetic similarity to Betacoronavirus 1 (Erles et al., 2007). 

66 

67 BCoV is an enveloped virus with a single-stranded, positive-sense, non-segmented RNA genome 

68 of approximately 31 kb (Clark, 1993). A 4092 nucleotide (nt) fragment of BCoV genome 

69 encodes the large petal-shaped surface spike (S) protein. This is a type 1 membrane glycoprotein 

70 of 1363 amino acids that comprises two hydrophobic regions, an amino-terminal (N-terminal) 

71 signal sequence and carboxyl-terminal (C-terminal) membrane anchor (Parker et al., 1990). The 

72 S protein is cleaved by an intracellular protease between aa 768 and 769 to form two functionally 

73 distinct subunit domains, a variable S1 N-terminal domain and the more conserved S2 C-
74 terminal domain (Abraham et al., 1990). The S1 subunit is a peripheral protein, mediating virus

75 binding to host-cell receptors (Li, 2012; Peng et al., 2012), haemagglutinating activity (Schultze 

76 et al., 1991) and inducing neutralizing antibodies (Yoo & Deregt, 2001). The S2 subunit is a 

77 transmembrane protein which mediates fusion of viral and cellular membranes (Yoo et al., 

78 1991a). 

79 

80 BCoV is the causative agent of neonatal calf diarrhea (CD), winter dysentery (WD) in adult 

81 cattle (Alenius et al., 1991; Mebus et al., 1973; Saif et al., 1988), and respiratory tract disorders 

82 in cattle of all ages (Cho et al., 2001; Decaro et al., 2008a; Lathrop et al., 2000). This infection is 

83 not effectively controlled in the herds by current commercial vaccines (Saif, 2010). BCoV 

84 negatively impacts cattle industry due to reduced milk production, loss of body condition and 

85 also through the death of young animals (Clark, 1993; Saif, 2010). BCoV outbreaks most often 

86 happen during fall and winter (Clark, 1993). However, studies from various climate regions have 

87 also reported BCoV outbreaks in the warmer seasons (Bidokhti et al., 2012; Decaro et al., 

88 2008b; Park et al., 2006). 

89 

90 Studies have shown high prevalence of BCoV infections in cattle farms in many countries 

91 (Fulton et al., 2011; Paton et al., 1998; Saif, 2010; Traven et al., 2001). Also BCoV-like 

92 coronaviruses transmissible to gnotobiotic calves have been found among various wild ruminants 

93 (Alekseev et al., 2008; Tsunemitsu et al., 1995). The public health impact of BCoVs has also 

94 been raised due to the isolation of a BCoV-like human enteric coronavirus - 4408/US/94 

95 (HECV-4408/US/94) from a child with acute diarrhoea (Zhang et al., 1994), and also the 

96 outbreaks of severe acute respiratory syndrome CoV (SARS-CoV) (Groneberg et al., 2003; 
97 Zhong & Wong, 2004). Molecular evolutionary analysis of HCoV-OC43 isolates suggests BCoV

98 as their genetically closest counterpart compared to other CoV species (Vijgen et al., 2006). 

99 Recently, a novel coronavirus HCoV-EMC was found that has been circulating in the Middle 

100 East and caused death with similar clinical signs to SARS-CoV (Al-Ahdal et al., 2012; Zaki et 

101 al., 2012). Such veterinary and public health concerns rationalize the study of the genetic 

102 diversity and evolution of BCoV strains and their relationship with the other Betacoronaviruses. 

103 

104 The S gene sequence of BCoV has been exploited for epidemiological (Bidokhti et al., 2012; 

105 Decaro et al., 2008c; Hasoksuz et al., 2002; Jeong et al., 2005; Lathrop et al., 2000; Liu et al.,

106 2006; Martinez et al., 2012) and evolutionary (Vijgen et al., 2005b; Woo et al., 2012) studies. So

107 far, no study has systematically defined the positive selection pattern of the S protein of BCoV 

108 strains which is probably important for BCoV adaptive evolution. In the present study, to better 

109 understand the epidemiologic dynamics of BCoV and to investigate the adaptive evolutionary 

110 process of BCoVs, we sequenced the full-length S gene and analyzed molecular epidemiology, 

111 evolution and selective pressures of this virus in cattle herds in Sweden and Denmark. Reference

112 strains from other hosts in Betacoronavirus 1, including human, wild ruminants, pig and horse, and

113 also CRCoV from dog were included in this analysis to estimate their time of divergence and 

114 update their genetic relationship. 

115 
116 RESULTS

117 

118 Sequence data and genome analysis 

119 

120 Comparative analysis of the S gene (4092 nt) indicated that all 33 Swedish and Danish strains 

121 (GenBank accession numbers: KF169908-KF169940) shared a high degree of sequence identity 

122 both at nt level (>97.8%) and deduced aa level (>97.4%). Compared with the 

123 BCoV/Mebus/US/72 strain, 78 to 113 nt substitutions (97.2% to 97.9% sequence identity) were 

124 found resulting in 37-54 aa changes (96% to 97.2% sequence identity) within the entire S gene of 

125 the strains. The 100% identical strains SWE/I/07-3, SWE/I/07-4 and SWE/I/07-5 from Sweden 

126 were found to be 99.7% similar to the strain SWE/P/09-1. SWE/I/07-3 and SWE/I/07-4 were 

127 obtained from different cows with enteric disease in the same herd in Gotland island in south- 

128 eastern Sweden. SWE/I/07-5 was obtained from another herd in Gotland island during the same 

129 time. SWE/P/09-1 was obtained from a cow with respiratory disease in a herd in south-western 

130 Sweden. 

131 

132 SWE/N/05-1 and SWE/N/05-2 showing 8 nt substitutions (99.8% identity) were sampled from 

133 different calves with enteric disorders at the same occasion in a large dairy herd. The oldest 

134 strain, SWE/C/92 showed the highest identity (nt 98.7%, aa 98.7%) to an old strain, DEN/03-3, 

135 and the lowest identity (nt 97.8%, aa 97.4%) to a recent strain, SWE/M/10-1. SWE/Y/10-3 from 

136 northern Sweden and SWE/P/10-4 from south-western Sweden showed 99.9% nt identity. These 

137 strains were obtained during the same year from different regions. 

138 
139 The analysis of the predicted S proteins of the present 33 BCoV strains revealed a potential N-

140 terminal signal peptide of about 14 amino acids using SignalP-HMM and SignalP-NN.

141 A potential S1/S2 cleavage site located after RRSRR, identical for BCoV (Abraham et al., 1990) 

142 and some HCoV-OC43 (Lau et al., 2011), was identified in the S proteins of all strains excluding 

143 the 2010 strains. The R-to-K aa change in the 764 position, leading to a KRSRR motif, was 

144 observed in the S proteins of SWE/Y/10-3 and SWE/P/10-4. The A-to-E aa change in the 769 

145 position, leading to a RRSRRE motif, was observed downstream of the potential cleavage site in 

146 the S proteins of SWE/M/10-1 and SWE/M/10-2. It has been suggested that changes in the last 

147 position of the motif affect the S protein cleavability (Vijgen et al., 2005a). This cleavage 

148 process is believed to play an important role in the fusion activity and viral infectivity of BCoV 

149 (Storz et al., 1981; Vijgen et al., 2005a). More sequence data and experimental studies are 

150 required to clarify the important role of these changes in the cleavage site of BCoV. The analysis 

151 of the S protein showed 20 potential N-linked glycosylation sites in all Swedish and Danish 

152 BCoV strains, with nine NXS (T133, M359, V437, P444, S696, D788, F895, I1234, Q1288) and

153 eleven NXT (T59, F198, A649, R676, N714, S739, C937, N1194, Y1224, Q1253, V1267) sites. 
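
The N-linked glycosylation motif mentioned above (N-X-S or N-X-T, where X is any residue except proline) can be located with a simple pattern scan. The sketch below is only an illustration of that motif search; it is not the predictor used in this study, which also scores sequence context, and the toy peptide and function name are hypothetical.

```python
import re

# N-X-S/T sequon, X being any residue except proline.
# The lookahead allows overlapping sequons (e.g. "NNSS") to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of the N, matched triplet) for every sequon."""
    hits = []
    for match in SEQUON.finditer(protein.upper()):
        hits.append((match.start() + 1, match.group(1)))
    return hits


# Hypothetical toy peptide, not a BCoV sequence.
toy_peptide = "MANTSANPSGNGTQ"
for position, triplet in find_sequons(toy_peptide):
    kind = "NXS" if triplet.endswith("S") else "NXT"
    print(f"{kind} sequon {triplet} with N at position {position}")
```

A motif hit only marks a candidate site; whether it is actually glycosylated depends on structural context that this scan ignores.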

154 

155 Phylogenetic tree 

156 

157 The analyzed samples showed low variability. Within the 4092 nt of the complete sequences of 

158 the S protein gene, 340 nt were variable (8.3%). At the aa level the variation was slightly larger 

159 (147 variable aa residues, 10.8%). Nucleotide p-distances among strains ranged between 0.1 and 

160 2.7%. This high degree of sequence identity is reflected in the NJ tree (Fig. 1): all Swedish and 

161 Danish strains from 2002 to 2010 clustered together as a unique clade with Italian strains; 
162 BuCoV/ITA/179-07-11, BCoV/438/06-2/ITA and BCoV/ITA/339/06. The oldest Swedish strain

163 SWE/C/92 was branched away from this clade and clustered into a separate clade with 

164 BCoV/GER/M80844/89 and human isolate HECV-4408/US/94. The remaining reference strains 

165 derived from cattle and wild ruminants clustered irrespective of the host. The CRCoV clade was 

166 most closely related to the BCoV and BCoV-like coronavirus clade; while HCoV-OC43, PHEV 

167 and ECoV clusters were more distant (Fig. 1). 
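
The variability figures quoted in this section (proportion of variable sites, pairwise p-distance) follow from simple column counts over an alignment. The following is a minimal sketch under the assumption of an equal-length, gap-padded alignment; the toy sequences and function names are hypothetical and are not taken from the strains analyzed here.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def variable_sites(alignment):
    """Count alignment columns with more than one distinct non-gap character."""
    count = 0
    for column in zip(*(seq.upper() for seq in alignment)):
        if len({c for c in column if c != "-"}) > 1:
            count += 1
    return count


# Hypothetical toy alignment.
toy_alignment = ["ATGGCTAA", "ATGGCAAA", "ATGACTAA"]
print(variable_sites(toy_alignment), "variable columns")
print(p_distance(toy_alignment[0], toy_alignment[1]))

# The percentages reported in the text, recomputed from the stated counts.
print(round(100 * 340 / 4092, 1), "% variable nt")
print(round(100 * 147 / 1363, 1), "% variable aa")
```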

168 

169 Fifty-three nt differences were found between strains SWE/M/06-3 and SWE/M/06-4 (98.7% nt 

170 similarity, 98.1% aa similarity). These strains were obtained from two dairy herds with CD 

171 symptoms sampled at the same time in southern Sweden. SWE/M/06-3 clustered with 

172 SWE/AC/08-1, SWE/E/08-2, SWE/Z/07-1, SWE/C/07-2, SWE/C/07-6 and SWE/U/09-3 (Fig. 1), 

173 sharing more than 99.4% sequence similarity. 

174 

175 Evolutionary rate and estimation of divergence dates 

176 

177 Molecular clock analysis of Swedish and Danish BCoV strains and reference strains of 

178 Betacoronavirus 1 using a Bayesian coalescent approach was performed to estimate their mean rate

179 of evolution and their time to the most recent common ancestor (TMRCA) which are shown in 

180 detail in Table 3. TMRCA of CRCoV and BCoV was dated to 1951. The mean evolution rate of 

181 Swedish and Danish BCoV strains compared to CRCoV was estimated at 4.4 × 10⁻⁴ substitutions

182 per site per year. TMRCA analysis estimated earlier divergence of BCoV strains from HCoV- 

183 OC43 (1899), PHEV (1847) and ECoV (1797). The mean evolution rate of Swedish and Danish 

184 BCoV strains compared to HCoV-OC43 was 4.1 × 10⁻⁴ substitutions per site per year, 7.6 × 10⁻⁴
185 compared to PHEV and 7.9 × 10⁻⁴ compared to ECoV. TMRCA of BCoV compared to CoVs

186 from wild ruminants was dated to 1963 and the mean rate of evolution was estimated to be 

187 4.4 × 10⁻⁴ substitutions per site per year. Swedish and Danish BCoV strains sequenced in this study

188 showed the highest mean rate of evolution relative to BCoV reference strains and HECV-4408/US/94:

189 8.7 × 10⁻⁴ and 8.3 × 10⁻⁴ substitutions per site per year, respectively. This resulted in estimating

190 almost the same year for TMRCA, 1978 and 1977, respectively (Table 3). 
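
To give a feel for what these rates and dates imply, the expected amount of change along a single lineage can be approximated as rate × elapsed time. The sketch below assumes a strict clock and a sampling year of 2010 (the most recent strains in this study). Both are simplifying assumptions made here for illustration and do not reproduce the Bayesian coalescent analysis.

```python
S_GENE_LENGTH = 4092  # nt, full-length S gene as stated in the text


def expected_substitutions(rate, tmrca_year, sample_year, length=S_GENE_LENGTH):
    """Expected substitutions along one lineage under a strict clock.

    rate is in substitutions per site per year. Returns a tuple of
    (substitutions per site, substitutions over the whole gene).
    """
    years = sample_year - tmrca_year
    if years < 0:
        raise ValueError("sample year must not precede the TMRCA")
    per_site = rate * years
    return per_site, per_site * length


# Rates and TMRCA values as reported above; 2010 is an assumed sampling year.
comparisons = {
    "CRCoV": (4.4e-4, 1951),
    "HCoV-OC43": (4.1e-4, 1899),
    "PHEV": (7.6e-4, 1847),
    "ECoV": (7.9e-4, 1797),
}
for name, (rate, tmrca) in comparisons.items():
    per_site, total = expected_substitutions(rate, tmrca, 2010)
    print(f"BCoV vs {name}: {per_site:.4f} subs/site, about {total:.0f} nt changes")
```

The numbers are order-of-magnitude checks only; the posterior estimates in Table 3 account for tip dates and rate uncertainty, which this arithmetic does not.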

191 

192 Results from bootscan analysis were in line with the observations described above and in 

193 phylogenetic tree (Fig. 1). Bootscan analysis showed a number of possible recombination sites 

194 when the S gene of BCoV strains was used as the query. Most of the region exhibited higher

195 bootstrap support for the clustering of BCoV strains with CRCoV, except upstream of position

196 500, where higher bootstrap support for clustering with HCoV-OC43 strains was observed.

197 Similar results were obtained when CRCoV strains were subjected to bootscan analysis (Fig.

198 S1). When the S gene of HCoV-OC43 strains was used as the query, the region downstream of position

199 1800 exhibited higher bootstrap support for the clustering of HCoV-OC43 strains with PHEV.

200 Similar results were obtained when PHEV strains were subjected to bootscan analysis (Fig. S1).
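
The idea behind a bootscan is to slide a window along the alignment and ask, window by window, which reference group the query clusters with. The sketch below is a deliberately simplified stand-in: it uses plain window identity instead of bootstrapped trees, and the sequences, names, window size and step are all hypothetical.

```python
def identity(seq_a, seq_b):
    """Fraction of identical sites, ignoring columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 0.0


def window_scan(query, references, window=200, step=20):
    """Return (window start, closest reference name, identity) for each window.

    Ties are resolved in favour of the reference listed first.
    """
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        scores = {name: identity(query[start:end], seq[start:end])
                  for name, seq in references.items()}
        best = max(scores, key=scores.get)
        results.append((start, best, scores[best]))
    return results


# Hypothetical toy data: the query resembles ref_A in its first half
# and ref_B in its second half, mimicking a recombination breakpoint.
query = "A" * 8 + "C" * 8
references = {"ref_A": "A" * 8 + "G" * 8, "ref_B": "T" * 8 + "C" * 8}
for start, best, score in window_scan(query, references, window=8, step=8):
    print(start, best, score)
```

A switch in the closest reference between adjacent windows is what a real bootscan would flag as a candidate breakpoint, subject to bootstrap support.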

201 

202 Selective pressure sites 

203 

204 The selection profiles of the aa sequence of all 33 Swedish and Danish BCoV strains showed two 

205 general patterns within the S protein. The cumulative dN-dS revealed that aa residues 109-131 

206 and 495-527 of the S1 subunit were under strong positive selection (Fig. 2a). Amino acid

207 residues 36-97, 315-420, 498-713, 910-1032, 1059-1234 and 1245-1279 were under negative
208 selection. They covered most of the S2 subunit, indicating that S2 is relatively stable in BCoV

209 (Fig. 2a). 
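
A cumulative dN-dS profile rises over stretches where nonsynonymous changes dominate and falls where synonymous changes dominate. The sketch below illustrates that idea with raw counts of codon changes against a single reference. This is a simplification chosen for brevity and not the method used by the SNAP server, which compares sequence pairs and normalizes by the number of synonymous and nonsynonymous sites. All sequences and function names are hypothetical.

```python
from itertools import accumulate, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame nucleotide sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def per_codon_balance(reference, others):
    """Per codon: (nonsynonymous changes) minus (synonymous changes) vs the reference.

    Codons containing gaps or ambiguous bases are skipped.
    """
    ref = codons(reference)
    balance = [0] * len(ref)
    for seq in others:
        for i, (r, c) in enumerate(zip(ref, codons(seq))):
            if r == c or r not in CODON_TABLE or c not in CODON_TABLE:
                continue
            balance[i] += 1 if CODON_TABLE[r] != CODON_TABLE[c] else -1
    return balance


def cumulative(balance):
    """Running sum along the gene, analogous to a cumulative dN-dS curve."""
    return list(accumulate(balance))


# Hypothetical toy data: reference encodes M-A-K.
reference = "ATGGCTAAA"
strains = ["ATGGCCAAA",   # GCT -> GCC, synonymous (A -> A)
           "ATGGCTAGA",   # AAA -> AGA, nonsynonymous (K -> R)
           "ATGGCTGAA"]   # AAA -> GAA, nonsynonymous (K -> E)
balance = per_codon_balance(reference, strains)
print(balance, cumulative(balance))
```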

210 

211 The SNAP analysis identified 133 positively selected sites, 89 of them in the S1 and 44 in the S2

212 domain (Fig. 2b). Several of these sites were also identified by the REL method at a posterior

213 probability > 90%. The following positive selection sites were identified by both the SNAP and

214 REL methods: 35, 112, 113, 115, 143, 147, 151, 157, 188, 257, 447, 458, 471, 482, 499, 501, 

215 503, 510, 523, 525, 543, 546, 573, 578, 590, 596, 718, 722, 888, and 1239 (Table 4). 

216 

217 Protein modelling comparisons 

218 

219 To determine if a homology model of the S protein for HECV-4408/US/94, SWE/C/92, DEN/03- 

220 3, SWE/M/10-1 and GER/V270/83 could be generated, each of these five sequences were 

221 searched individually against the Protein Data Bank (PDB) entries 

222 (http://www.rcsb.org/pdb/home/home.do) using default parameters. Based on the Z-score, all of 

223 these S protein sequences of BCoVs had the highest structural similarity to the crystal structure 

224 of murine hepatitis virus (PDB ID: 3R4D). Notably, the S1 sequences of the 33 BCoV strains

225 contain a putative receptor binding domain (aa residues 326 to 540, Fig. 2) with 94.8 to 97.6% aa

226 identities to sequences of BCoV/Mebus/US/72 and GER/V270/83. This part of the BCoV S

227 proteins had the highest sequence similarity to the SARS receptor-binding domain-like

228 superfamily (Scop ID: 143587), spanning aa residues 328-493 of the S protein of SARS; the so 

229 called C-domain (Wong et al., 2004). Sialic acid is known to be the receptor for S protein 

230 binding in BCoV, although the receptor-binding domain is not well defined (Schultze et al.,
231 1991). The BCoV S protein also contains an N-terminal domain (NTD) spanning aa residues 15 to

232 298, as recently defined in detail (Peng et al., 2012), with 92.9 to 95% aa identities to sequences

233 of BCoV/Mebus/US/72 and GER/V270/83. 

234 

235 Default parameters were used in I-TASSER to predict structures of these proteins as explained in 

236 the Materials and Methods section. Results indicated that the NTD and putative C-domain of S1

237 were structurally similar for HECV-4408/US/94 and SWE/C/92 (Fig. 3a, b). This similarity is 

238 clearly illustrated when the two structures are aligned (Fig. 3c). In contrast, the predicted 

239 structures for SWE/M/10-1, and GER/V270/83 were substantially divergent while DEN/03-3 

240 shows an intermediate conformation (Fig. 3d-f). Also in the S2 region HECV-4408/US/94 and 

241 SWE/C/92 differed in conformation compared to the other strains. The residues primarily 

242 predicted as potential receptor binding sites based on homology with the S protein of SARS were 

243 used in the generation of structural models. Notably, parts of the putative receptor binding 

244 domain and of the NTD were found to be in the strong positively selected regions on the surface 

245 of the S1 subunit (Fig. 3g, residues coloured green and red in SWE/C/92).

246 
247 DISCUSSION


248 

249 Circulation patterns of BCoV strains 

250 

251 This is the first evolution study to include full-length S gene sequences of BCoV strains obtained 

252 from European countries. The twenty-six Swedish and seven Danish BCoV strains sequenced in 

253 this study show low genetic diversity that results in their clustering as a unique clade in the

254 phylogenetic tree (Fig. 1). We show based on the full-length S gene that there are no consistent 

255 differences between BCoV strains obtained from respiratory and enteric disease. This is in 

256 accordance with our previous study of partial S sequences (Bidokhti et al., 2012). In two herds, 

257 identical sequences (e.g. SWE/02-1 and SWE/I/07-3) were found in different cattle sampled at 

258 the same occasion supporting previous findings that a herd disease outbreak is caused by a 

259 dominant strain (Bidokhti et al., 2012; Liu et al., 2006). However, in a large dairy herd (>200

260 cows) we found two slightly different (99.8%) CD strains, SWE/N/05-1 and SWE/N/05-2, which 

261 were circulating at the same time. This finding indicates that strains with genetic diversity, 

262 though limited, can circulate in such herds. Large dairy herds were previously found to have a 

263 higher incidence of BCoV infection (Ohlson et al., 2010; Smith et al., 1998) which is consistent 

264 with the concept that large herds may foster a favorable environment for virus introduction and 

265 circulation of the strains. 

266 

267 A high similarity was observed between Italian and Swedish strains. We also identified a high 

268 similarity (99.4%) between the strain SWE/M/06-3 and six other strains that circulated in 2007 

269 to 2009 in distant regions of Sweden, implying that certain strains may have the potential to 
270 spread directly or indirectly to distant regions or to other countries. No identical strains obtained

271 from different epidemic seasons have been identified, but some strains were highly similar. High 

272 stability of certain BCoV strains was shown by the finding of identical strains in Gotland island 

273 in 2007 (e.g. SWE/I/07-3) and a highly similar strain obtained from another region in 2009 

274 (SWE/P/09-1). Highly similar strains were also found in different regions in 2010 (SWE/Y/10-3, 

275 SWE/P/10-4). This suggests that these BCoV strains were part of common transmission chains. 

276 These data support previous findings that S gene sequences can help to clarify the

277 transmission routes of BCoV strains (Bidokhti et al., 2012; Kanno et al., 2013). 

278 

279 Rate of evolution of BCoV strains 

280 

281 This evolutionary analysis encompassed a large data set of Betacoronavirus 1 sequences of full-

282 length S gene obtained over 45 years (1965-2010), including newly sequenced Swedish and 

283 Danish BCoV strains from the last decade and one strain from 1992. Sampling over time 

284 provides us with heterochronous data to calculate an evolutionary rate and to estimate the time of 

285 divergence of the recent BCoV sequences. The estimated rate of nt substitution in the S gene of 

286 BCoV (8.7 × 10⁻⁴ substitutions/site/year) is comparable to the standard range (orders

287 of 10⁻³ to 10⁻⁵) observed in other rapidly evolving RNA viruses, such as nonstructural protein 2 (NSP2) of

288 rotavirus A (Donker & Kirkwood, 2012) and E gene of Dengue virus 3 (Sail et al., 2010). 

289 TMRCA estimate for BCoV strains in this study compared to published BCoV S gene sequences 

290 from other countries was 1978 (95% CI: 1974 to 1981). This date is even more recent than

291 the previously reported estimate (Vijgen et al., 2006), which placed the divergence of BCoVs within

292 the last 60 years, at 1944 (95% CI: 1910 to 1963). This implies a high ability of BCoV
293 to adapt to the cattle population and spread over a large geographical region in a relatively short

294 period of time. 

295 

296 Molecular clock analysis of the spike gene of the recent BCoV strains and HCoV-OC43 strains 

297 estimated an evolutionary rate in the order of 4.1 × 10⁻⁴ substitutions per site per year, which is

298 similar to a previous estimate of 4.7 × 10⁻⁴ substitutions per site per year (Vijgen et al., 2005b).

299 Bayesian coalescent approach dated TMRCA around 1899, highly similar to the previous 

300 estimate of around 1890 (Vijgen et al., 2005b). Evolutionary analysis of our BCoV strains along 

301 with other virus species in Betacoronavirus 1 demonstrated a closer relationship of BCoV to

302 canine and human CoVs than to porcine and equine CoVs. TMRCA of CoVs is in accordance 

303 with their clustering in the phylogenetic tree (Fig. 1). The time of divergence of BCoV and 

304 CRCoV strains was estimated to have occurred five decades after that of BCoV and HCoV- 

305 OC43 strains, suggesting a closer common ancestor of the former. The spike protein of CRCoV- 

306 4182/UK/03 has been shown to have a higher genetic similarity to BCoV/Mebus/US/72 and 

307 BCoV/LY138/US/65 than to HCoV-OC43/VR759/UK/6 (Erles et al., 2007). In that study, the

308 cross-reactivity of CRCoV-4182/UK/03 with polyclonal antisera against BCoV was also shown 

309 (Erles et al., 2007). This corresponds to what is illustrated in the phylogenetic tree (Fig. 1); the 

310 clade of ruminant CoVs is clustered closer to the clade of CRCoV strains than to the other virus 

311 species in Betacoronavirus 1. At the tree level, CoVs from bovines and several wild ruminant

312 species clustered closely together, implying that such interspecies transmission of CoVs may 

313 occur as suggested previously (Alekseev et al., 2008). 

314 
315 In this study, we reported a close genetic relationship (98.9% nt identity, 98.6% aa identity) and

316 high simulated structural similarity of the S protein of HECV-4408/US/94 with a BCoV field 

317 strain, SWE/C/92. The infectivity of HECV-4408/US/94 for gnotobiotic calves and complete 

318 cross-protection against BCoV/DB2/84 isolate showing 98.2% aa identity (98.6% nt identity) to 

319 HECV-4408/US/94 in the S protein has been experimentally confirmed (Han et al., 2006). Thus, 

320 the similarity between SWE/C/92 and HECV-4408/US/94 S protein conformation further 

321 supports the hypothesis of possible interspecies transmission of these viruses. Future studies to 

322 find novel strains of Betacoronavirus 1 and determination of the structure of the S protein would

323 greatly assist in determining how such interspecies transmissions occur. 

324 

325 Positive selection on the S protein 

326 

327 The selection profiles identified two main patterns within the subunit domains S1 and S2 of the S

328 protein. The S1 subunit is exposed on the surface of the viral particle, and is the target of

329 neutralizing antibodies (Deregt & Babiuk, 1987; Yoo & Deregt, 2001; Yoo et al., 1991b). The

330 S1 subunit has two domains with a clear positive selection pattern (Fig. 2). Positively selected

331 fragments of genes encoding viral proteins exposed on the surface of the capsid have been 

332 documented in other viruses, such as in porcine circovirus type 2 (PCV2) (Olvera et al., 2007) 

333 and porcine parvovirus (PPV) (Shangjin et al., 2009). There is an association between positively 

334 selected sites along the S1 subunit identified in this study and mapped neutralizing epitopes.

335 Epitopic fragments spanning aa residues 324-720 of the S1 subunit of BCoV and the N-terminus

336 of the S2 subunit spanning aa residues 769-798 have been previously recognized using 

337 monoclonal antibodies (MAbs) (Vautherot et al., 1992a; Yoo et al., 1991b). A polymorphic 
338 region spanning aa residues 456-592 has also been shown by sequence analysis of BCoV strains

339 (Rekik & Dea, 1994). It has been reported that mutations in the S1 and the N-terminus of the S2

340 sequence often result in changes in antigenicity (Kanno et al., 2013; Vautherot et al., 1992b; Yoo

341 & Deregt, 2001). Likewise, parts of the putative receptor binding domain defined in this study 

342 and the NTD defined in detail in a previous study (Peng et al., 2012) were shown to be under 

343 strong positive selection in the BCoV strains. Taken together, the strong positively selected 

344 motifs among the S protein may thus be associated with the immune response and receptor- 

345 binding and would thus be important in future BCoV vaccine development. The negative 

346 selection pattern of the S2 subunit is also reported (Fig. 2). Negative selection is usually reported 

347 in genome fragments with essential functions in the viral lifecycle (Yang, 2005). For example, 

348 extensive syncytia formation was observed in cells infected with an S2 recombinant of BCoV 

349 (Yoo et al., 1991a). The structure of the SARS-CoV S2 fusion protein core has been shown to 

350 provide a framework for the design of entry inhibitors that could be used in the therapeutic 

351 intervention against this virus (Supekar et al., 2004). Thus we speculate that the S2 subunit, 

352 except its N-terminus, would mostly interact with cellular compartments rather than immune 

353 system elements of the host. 

354 

355 Vaccination with an inactivated vaccine against BCoV has been used only to a very limited extent in

356 Swedish cattle herds. Thus we conclude that selective pressure sites observed in the receptor 

357 binding subunit of S protein gene of BCoV strains indicate a natural mode of evolution that is 

358 mainly due to exposure to the host immune system. Currently available vaccines are based on old 

359 enteric BCoV strains, genetically and antigenically different from currently circulating BCoV 

360 strains (Fulton et al., 2013). Thus, continuous monitoring of sequence changes in positive 
361 selection sites may provide potentially useful data for identifying future dominant epidemic

362 strains. This can then help to update the vaccine strains. 

363 

364 Studies are also warranted to detect the emergence of new genotypes and recombinants of BCoV 

365 as well as other betacoronaviruses and to assess their significance and potential in causing future 

366 epidemics. Nevertheless, it should be noted that the sequencing of a single gene may not be 

367 sufficient to define the genotypes of BCoV, as previously shown for human betacoronaviruses 

368 (Lau et al., 2011; Woo et al., 2006). Based on the lessons from HCoV-OC43 genotyping (Lau et 

369 al., 2011) and recent evolutionary evaluation of the diverse genetic BCoV population through 

370 pioneering in-depth sequencing analysis (Borucki et al., 2013), the deep sequencing of BCoV 

371 should therefore be performed to better understand the molecular epidemiology of BCoV, to 

372 determine genotypes and to reveal possible recombination events. 

373 
374 MATERIALS AND METHODS

375 

376 Clinical samples. In total, thirty-three field samples (25 fecal and 8 nasal) were sequenced from

377 cattle in 29 herds (Table 1) from Sweden and Denmark. Sampled animals in all herds were 

378 showing clinical signs of BCoV infection. The samples were collected during outbreaks that 

379 occurred from 2002 to 2010. All seven Danish samples (one nasal and six fecal) were from 2003 

380 and 2005. The oldest Swedish strain, which was from a WD outbreak in Uppland in 1992, was 

381 also sequenced. In this study, no cell culture passaged virus was utilized. Samples were kept

382 frozen at -70°C until analyzed. 

383 

384 RNA extraction, cDNA synthesis, primer pairs and PCR. RNA extraction with TRIzol LS 

385 reagent (Invitrogen) and cDNA synthesis with random priming were performed as described 

386 previously (Liu et al., 2006). In order to amplify and sequence the S gene (4092 nt), seven pairs 

387 of primers (Table 2) were used to generate a set of overlapping PCR products encompassing the 

388 entire S gene. Among these primers, six pairs (AF/AR, BF/BR, CF/CR, DF/DR, GF/GR, 

389 HF/HR) were already published (Hasoksuz et al., 2002; Jeong et al., 2005), while one pair 

390 (EF/ER) was designed by our group. 

391 

392 Amplification of the full-length S gene was performed in a DNA Thermal Cycler (Perkin-Elmer) 

393 using Pfu Ultra DNA polymerase (Stratagene). Briefly, 1 µl of cDNA was amplified in a 50 µl

394 reaction containing 5 µl of 10× Pfu Ultra buffer, 1 µl of 10 mM dNTP, 1 µl each of the AF and HR

395 primers (10 µM), 2.5 U of Pfu Ultra DNA polymerase, and 40 µl of ddH2O. The cycling profiles
396 consisted of 2 min of denaturation at 95°C followed by 35 cycles of 95°C for 30 s, 50°C for 60 s,

397 and 72°C for 4min, and a final extension step for 7min at 72°C. 
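
When several samples are amplified in one run, the per-reaction volumes above are usually combined into a master mix. The sketch below scales the listed volumes (read as microlitres) to a batch. The 10% pipetting excess and the function name are assumptions made here, and the polymerase is omitted because the text gives it in units rather than volume.

```python
# Per-reaction volumes in microlitres, as listed in the protocol above.
# The polymerase (2.5 U) is left out because its volume is not stated.
REACTION_UL = {
    "cDNA template": 1.0,
    "10x Pfu Ultra buffer": 5.0,
    "10 mM dNTP": 1.0,
    "primer AF (10 uM)": 1.0,
    "primer HR (10 uM)": 1.0,
    "ddH2O": 40.0,
}


def master_mix(reaction, n_reactions, excess=0.10, template_key="cDNA template"):
    """Scale shared components for n reactions plus a pipetting excess.

    The template is excluded because it is added to each tube separately.
    The 10% default excess is an assumed convention, not from the protocol.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1 + excess)
    return {name: round(volume * factor, 2)
            for name, volume in reaction.items() if name != template_key}


print("listed volume per reaction:", sum(REACTION_UL.values()), "ul")
for component, volume in master_mix(REACTION_UL, n_reactions=10).items():
    print(f"{component}: {volume} ul")
```

The listed components sum to 49 µl, which leaves about 1 µl for the enzyme in the stated 50 µl reaction.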

398 

399 In order to increase the sensitivity of the PCR detection method, nested and semi-nested PCR (N- 

400 and SN-PCR) assays were developed as described previously (Bidokhti et al., 2012). Briefly, 5 µl

401 of the first PCR product was added to a tube with 45 µl of PCR mixture, comprising 5 µl of 10×

402 PCR buffer, 1 µl of 10 mM dNTP mixture, 5 µl of 1 mg/ml bovine serum albumin, 1.5 µl of each

403 primer (10 µM), 5 µl of 25 mM MgCl2, 1 U of Taq DNA polymerase (AmpliTaq; Perkin-Elmer)

404 and 24 µl of water. The thermocycling profile included 35 cycles of denaturation at 94°C for 45 s,

405 annealing at 50°C for 60s, and extension at 72°C for 3min, and a final extension at 72°C for 

406 7min. For each strain, all seven fragments (A, B, C, D, E, G and H) were amplified by the 

407 corresponding primer pairs. 

408 

DNA sequencing and genome analysis. All seven PCR products of each strain were purified and
sequenced in both directions using the same primers as for PCR and an ABI PRISM BigDye
Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) as described (Liu
et al., 2006). Capillary electrophoresis was performed in an ABI 3100 genetic analyzer (Applied
Biosystems). Sequence chromatograms were aligned and assembled into a final 4092-nt fragment
of the S gene, stretching from nt positions 23641 to 27733 (aa residues 1 to 1363 of the S
glycoprotein) of the BCoV strain Mebus.
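The relationship between the 4092-nt fragment and the 1363 aa residues can be made explicit with a small coordinate conversion. This is a minimal sketch, assuming the fragment starts at the first base of the start codon and ends with the stop codon; the function names are illustrative and not part of any tool used in the study.

```python
S_GENE_NT = 4092  # length of the assembled S gene fragment, stop codon included


def nt_to_aa(nt_pos):
    """Map a 1-based nt position in the S gene to its 1-based codon (aa) number."""
    if nt_pos < 1:
        raise ValueError("nt positions are 1-based")
    return (nt_pos - 1) // 3 + 1


def aa_to_nt(aa_pos):
    """Return the (first, last) 1-based nt positions of the codon for a given residue."""
    if aa_pos < 1:
        raise ValueError("aa positions are 1-based")
    first = 3 * (aa_pos - 1) + 1
    return first, first + 2


total_codons = S_GENE_NT // 3           # 1364 codons in 4092 nt
coding_residues = total_codons - 1      # the final codon is assumed to be the stop codon
cleavage_site_nt = (aa_to_nt(768)[0], aa_to_nt(769)[1])  # S1/S2 cleavage site, aa 768-769
```

The same conversion is handy when comparing primer locations (given in nt) with the selection sites and domains reported in aa coordinates.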

Sequences were aligned with the ClustalW program available in the BioEdit Sequence Alignment
Editor (Hall, 1999). Phylogenetic trees were constructed from the nucleotide sequences using the
Neighbour-Joining (NJ) algorithm, with bootstrap values calculated from 1,000 replicates, in the
program MEGA 5 (Tamura et al., 2011). The receptor binding domain of the spike protein was
predicted using InterProScan (Apweiler et al., 2001). Potential N-glycosylation sites in the
spike proteins were predicted using the CBS NetNGlyc 1.0 server
(http://www.cbs.dtu.dk/services/NetNGlyc/). Reference sequences of the virus species
Betacoronavirus 1, including BCoV, HCoV-OC43, PHEV, ECoV and BCoV-like coronaviruses of wild
ruminants, as well as CRCoV, were retrieved from GenBank and included in this analysis (Table 1).
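The NJ tree in Fig. 1 is based on the p-distance, that is, the proportion of aligned sites at which two sequences differ. The sketch below shows one way to compute it from a pairwise alignment. It is not the MEGA implementation: skipping gapped or ambiguous columns (pairwise deletion) is an assumption made here, and the function name is hypothetical.

```python
def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or ambiguity code are skipped
    (pairwise deletion), which is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def percent_identity(seq_a, seq_b):
    """Identity expressed as a percentage, i.e. 100 * (1 - p-distance)."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))
```

For example, two aligned 10-nt sequences that differ at a single site have a p-distance of 0.1 and an identity of 90%.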

Selective pressure analysis. To explore potential overall differences in selective pressure on
the complete S gene sequences of the Swedish and Danish BCoV strains, we analyzed the
occurrence of synonymous (dS) and nonsynonymous (dN) substitutions using the SNAP server
available at http://www.hiv.lanl.gov/content/sequence/SNAP/SNAP.html (Korber, 2000), which
plots the cumulative and per-codon occurrence of each type of substitution from the start to
the end of the S gene.
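To illustrate what is being counted, the sketch below classifies codon differences between two codon-aligned sequences as synonymous or nonsynonymous and accumulates them along the gene. It is a deliberately simplified stand-in for SNAP, not a reimplementation: only codons that differ at exactly one position are scored, and the counts are not normalized by the number of potential synonymous and nonsynonymous sites. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(ref, query):
    """Per codon: 'S' (synonymous), 'N' (nonsynonymous) or None (not scored)."""
    if len(ref) != len(query) or len(ref) % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    calls = []
    for i in range(0, len(ref), 3):
        a, b = ref[i:i + 3].upper(), query[i:i + 3].upper()
        n_diff = sum(x != y for x, y in zip(a, b))
        if a not in CODON_TABLE or b not in CODON_TABLE or n_diff != 1:
            calls.append(None)  # identical, gapped/ambiguous, or multi-hit codon
        elif CODON_TABLE[a] == CODON_TABLE[b]:
            calls.append("S")
        else:
            calls.append("N")
    return calls


def cumulative_counts(calls):
    """Running (synonymous, nonsynonymous) totals from the start of the gene."""
    syn = nonsyn = 0
    totals = []
    for call in calls:
        syn += call == "S"
        nonsyn += call == "N"
        totals.append((syn, nonsyn))
    return totals


calls = classify_codon_changes("ATGTTTCTGGAT", "ATGTTCCTGGAA")
totals = cumulative_counts(calls)
```

In this toy pair, TTT to TTC leaves phenylalanine unchanged (synonymous), whereas GAT to GAA replaces aspartate with glutamate (nonsynonymous).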

To examine the robustness of the positively selected sites identified by SNAP, we also analyzed
our datasets using the HYPHY package accessed through the Datamonkey facility,
http://www.datamonkey.org (Pond & Frost, 2005). Datamonkey includes the random effects
likelihood (REL) method for detecting sites under selection. To detect positively selected
sites, the default significance level of Bayes factor > 50 was used for REL. The REL method is
often the only method that can infer selection from small (5-15 sequences) or low-divergence
alignments and tends to be the most powerful test. This method was run on the Datamonkey web
server using the GTR substitution model on a neighbor-joining phylogenetic tree in order to
investigate selective pressure along the S protein of the BCoV strains sequenced in this study.
Bootscan analysis was also used to detect possible recombination using the nucleotide alignment
of the S gene sequences of the virus species Betacoronavirus 1 and of CRCoV. Bootscan analysis
was performed using Simplot version 3.5.1 as described previously (Lau et al., 2011; Woo et al.,
2006), with BCoV, HCoV, ECoV, PHEV and CRCoV strains as the query.
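The Bayes factor threshold used for REL can be read as a ratio of posterior to prior odds that a site is under positive selection. The following sketch shows that calculation in isolation. The prior and posterior probabilities are hypothetical numbers chosen for illustration; they are not values from this study or from Datamonkey output.

```python
BAYES_FACTOR_CUTOFF = 50.0  # threshold stated in the text for REL


def bayes_factor(prior, posterior):
    """Posterior odds divided by prior odds for 'site is positively selected'."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not 0.0 <= posterior < 1.0:
        raise ValueError("posterior must lie in [0, 1)")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds


def is_positively_selected(prior, posterior, cutoff=BAYES_FACTOR_CUTOFF):
    """Apply the Bayes factor cutoff to a single site."""
    return bayes_factor(prior, posterior) > cutoff


# Hypothetical site: 5% prior weight on dN > dS, 90% posterior probability.
example_bf = bayes_factor(0.05, 0.90)
```

The example shows why the same posterior probability can pass or fail the cutoff depending on the prior: with a prior of 0.05 a posterior of 0.90 gives a Bayes factor of about 171, whereas with a prior of 0.5 it gives only about 9.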

Evolutionary rate and estimation of divergence dates. The rate of evolution and divergence
times were calculated from the S gene sequence data using a Bayesian Markov chain Monte Carlo
(MCMC) approach implemented in the BEAST v. 1.6.2 package (Drummond & Rambaut, 2007). Three
independent MCMC runs per dataset were performed under a strict molecular clock model, using
the Hasegawa-Kishino-Yano model of sequence evolution with a proportion of invariant sites and
gamma-distributed rate heterogeneity (HKY+I+Γ), with partitions into codon positions and the
remaining parameters in the priors panel left at their defaults. For the S gene, the MCMC
run was 3x10 steps long and the posterior probability distribution of the chains was sampled
every 1000 steps. Convergence was assessed on the basis of the effective sample size after a
10% burn-in using Tracer software, version 1.5 (Rambaut & Drummond, 2007). The estimates
are the mean values obtained for the three runs. The mean time of the most recent common
ancestor (tMRCA) and the 95% confidence interval (CI) were calculated, and the best-fitting
models were selected by a Bayes factor using marginal likelihoods implemented in Tracer
(Suchard et al., 2001).
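Under a strict molecular clock, divergence accumulates linearly with time, so a rate and a tMRCA can be converted into an expected genetic distance and back. The sketch below illustrates that arithmetic with the CRCoV values reported in Table 3 (4.4 × 10^-4 substitutions/site/year, tMRCA 1951). It is not how BEAST estimates these quantities: it assumes both lineages are sampled in the same year, and the sampling year of 2010 is chosen here only for illustration.

```python
def expected_branch_divergence(rate, sampling_year, tmrca_year):
    """Expected substitutions/site on one lineage between the MRCA and the sample."""
    elapsed_years = sampling_year - tmrca_year
    if elapsed_years < 0:
        raise ValueError("the MRCA cannot be younger than the sample")
    return rate * elapsed_years


def tmrca_from_pairwise_distance(pairwise_distance, rate, sampling_year):
    """Invert d = 2 * rate * t for two tips sampled in the same year."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return sampling_year - pairwise_distance / (2.0 * rate)


RATE = 4.4e-4          # substitutions/site/year (Table 3, CRCoV comparison)
TMRCA_YEAR = 1951      # Table 3, CRCoV comparison
SAMPLING_YEAR = 2010   # assumption: both tips sampled in 2010

per_lineage = expected_branch_divergence(RATE, SAMPLING_YEAR, TMRCA_YEAR)
pairwise = 2.0 * per_lineage  # both lineages evolve independently since the MRCA
recovered_year = tmrca_from_pairwise_distance(pairwise, RATE, SAMPLING_YEAR)
```

With these inputs each lineage is expected to accumulate roughly 0.026 substitutions per site over the 59 years since the common ancestor, and inverting the calculation returns 1951.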



In silico model analysis. Based on strain sequence identity and phylogenetic analysis, the aa
sequences of the S protein of five CoVs were chosen: HECV-4408/US/94 (the human isolate most
closely related to BCoV), SWE/C/92 (the oldest Swedish strain, which clustered with
HECV-4408/US/94), DEN/03-3 (the strain with the highest identity to SWE/C/92), SWE/M/10-1 (the
strain with the lowest identity to SWE/C/92), and GER/V270/83 (a bovine reference isolate from
Germany). Initially, a meta-threading approach was applied in I-TASSER (Zhang, 2008; Zhang &
Skolnick, 2004a) to identify templates for the submitted sequences in a non-redundant Protein
Data Bank structure library. From the generated consensus threading templates, the fragments of
the sequences were assembled into 3D models using modified replica-exchange Monte Carlo
simulations. To refine the overall topology, models were clustered in SPICKER (Zhang &
Skolnick, 2004b). A C-score was defined based on the quality of the threading alignments and
the convergence of parameters of the structure assembly simulations. The structures were
visualized and annotated in MacPyMOL v1.3 (Schrödinger, LLC).

LEGENDS OF FIGURES

Fig. 1. Neighbor-Joining tree based on the p-distance of the complete S gene nucleotide
sequences of the virus species Betacoronavirus 1, including the BCoV strains from Sweden and
Denmark sequenced in this study. Bootstrap values above 70% for 1,000 iterations are shown at
the branches.

Fig. 2. The distribution of accumulated (a) and per-codon (b) positive selection sites
identified using the SNAP server along the S protein of the BCoV strains sequenced in this
study. The two functionally distinct domains S1 and S2 are marked together with the cleavage
site (vertical arrow, aa residues 768-769). The first upper line represents the hypervariable
regions. The regions labeled with an asterisk were described previously (Bidokhti et al., 2012)
and the rest were found in this study, spanning aa residues 447-596, 718-722, 785-828, 875-888,
1235-1239 and 1275-1278. The second upper line represents the MAb binding sites previously
described for the S1 subunit (Yoo & Deregt, 2001) and for the S2 subunit (Vautherot et al.,
1992b) of BCoV, spanning aa residues 351-403, 517-621 and 769-798. The third upper line
represents the receptor binding domains described previously: the N-terminal domain (NTD)
spanning aa residues 15-298 of BCoV (Peng et al., 2012) and the C-domain spanning aa residues
318-510 of SARS-CoV (Wong et al., 2004). The putative C-domain of the BCoV strains was
predicted to span aa residues 326-540 using InterProScan. The last two lines represent the
negative and positive selection motifs based on accumulated dN-dS. Thicker arrows show the
strong selection motifs as described in the results.

Fig. 3. Predicted 3D structures of the S proteins of several coronavirus strains:
HECV-4408/US/94 (a), SWE/C/92 (b), DEN/03-3 (d), SWE/M/10-1 (e) and GER/V270/83 (f). In (c),
the first two S proteins, HECV-4408/US/94 (red) and SWE/C/92 (cyan), were aligned using
MacPyMOL. In (g), the cleavage site of the S protein of SWE/C/92 is labeled yellow (aa
residues 768-769), and the regions of the S protein under positive selection are shown in red
(aa residues 109-131) and green (aa residues 495-527). The regions of the S2 subunit under
purifying selection (aa residues 910-1032, 1059-1234 and 1245-1279) are marked cyan. The
putative receptor binding domain (the so-called C-domain, spanning aa residues 326-540) is
colored blue and green.

Table 1. BCoV strains utilized in this study.

Strain/isolate name       Sampling year  Sample origin      Sample type  Country             Previous label name‡  Accession number
SWE/C/92                  1992           Adult cattle       Fecal        Sweden              C1-9202               JN795143†
SWE/02-1                  2002           Calf               Nasal        Sweden              Nc1N-02a              DQ121634
SWE/02-2                  2002           Calf               Nasal        Sweden              Nc2N-02               DQ121635
SWE/02-3                  2002           Calf               Nasal        Sweden              Nc3N-02               DQ121637
SWE/02-4                  2002           Calf               Nasal        Sweden              Nc4N-02               DQ121638
DEN/03-1                  2003           Calf               Fecal        Denmark             Kc1F-03               DQ121631
DEN/03-2                  2003           Calf               Fecal        Denmark             Ac1F-03               DQ121619
DEN/03-3                  2003           Calf               Fecal        Denmark             Dc1F-03               DQ121622
DEN/05-1                  2005           Cattle             Fecal        Denmark                                   This study
DEN/05-2                  2005           Cattle             Fecal        Denmark                                   This study
DEN/05-3                  2005           Cattle             Nasal        Denmark                                   This study
DEN/05-4                  2005           Cattle             Fecal        Denmark                                   This study
SWE/N/05-1                2005§          Calf               Fecal        Sweden              N1-0511               JN795155†
SWE/N/05-2                2005§          Calf               Fecal        Sweden                                    This study
SWE/AC/06-1               2006           Adult cattle       Fecal        Sweden              AC1-0611              JN795141†
SWE/M/06-3*               2006           Calf               Fecal        Sweden                                    This study
SWE/M/06-4*               2006           Calf               Fecal        Sweden              M2-0605               JN795154†
SWE/Z/07-1                2007           Adult cattle       Fecal        Sweden              Z2-0711               JN795163†
SWE/C/07-2                2007           Adult cattle       Fecal        Sweden              C4-0712               JN795146†
SWE/I/07-3                2007‖          Adult cattle       Fecal        Sweden              I3-0703               JN795151†
SWE/I/07-4                2007‖          Adult cattle       Fecal        Sweden                                    This study
SWE/I/07-5                2007           Adult cattle       Fecal        Sweden                                    This study
SWE/C/07-6                2007           Adult cattle       Fecal        Sweden              C3-0711               JN795145†
SWE/AC/08-1               2008           Adult cattle       Fecal        Sweden              Y1-0801               JN795161†
SWE/C/08-2                2008           Adult cattle       Fecal        Sweden              C5-0801               JN795147†
SWE/I/08-3                2008           Adult cattle       Fecal        Sweden              I4-0810               JN795152†
SWE/P/09-1                2009           Adult cattle       Nasal        Sweden              P1-0902               JN795159†
SWE/C/09-2                2009           Calf               Nasal        Sweden              C6-0903               JN795148†
SWE/U/09-3*               2009           Calf               Nasal        Sweden              U1-0907               JN795160†
SWE/M/10-1                2010           Calf               Fecal        Sweden                                    This study
SWE/M/10-2                2010           Calf               Fecal        Sweden                                    This study
SWE/Y/10-3                2010           Calf               Fecal        Sweden                                    This study
SWE/P/10-4                2010           Calf               Fecal        Sweden                                    This study
GER/V270/83               1983           -                  -            Germany                                   EF193075
BCoV/GER/M80844/89        1989           Calf               Nasal        Germany                                   M80844.1
BCoV/ITA/339/06*          2006           Cattle             Fecal        Italy                                     EF445634
BuCoV/ITA/179-07-11       2007           Buffalo calf       Fecal        Italy                                     EU019216
WtDCoV/OH-WD470/94        1994           White-tailed deer  Fecal        US, Ohio                                  FJ425187.1
BCoV/KWD2/KOR/02          2002           Cattle             Fecal        South Korea                               AY935638.1
Nyala/KOR/10-1            2010           Nyala              Fecal        South Korea                               HM573330.1
BCoV/KCD2/KOR/04          2004           Calf               Fecal        South Korea                               DQ389633
BCoV/LSU/94               1994           Cattle             Nasal        US, Louisiana                             AF058943
BCoV/US/OK-0514-3/96      1996           Cattle             Nasal        US, Louisiana                             AF058944
WbCoV/OH-WD358/94         1994           Waterbuck          Fecal        US, Ohio                                  FJ425186.1
SDCoV/US/OH-WD388-GnC/94  1994           Sambar deer        -            US, Ohio                                  FJ425190.1
WbCoV/OH-WD358-GnC/94     1994           Waterbuck          Gn calf      US, Ohio                                  FJ425185.1
SDCoV/OH-WD388/94         1994           Sambar deer        Fecal        US, Ohio                                  FJ425189.1
SACoV/OH-1/03             2003           Sable antelope     Fecal        US, Ohio                                  EF424621.1
BCoV/AH65-E/OH/01         2001           Feedlot calf       Fecal        US, Ohio                                  EF424615.1
BCoV/AH65-R/OH/01         2001           Feedlot calf       Nasal        US, Ohio                                  EF424617.1
BCoV/ENT/US/98            1998           Cattle             Fecal        US, Texas                                 AF391541
GiCoV/OH3/03              2003           Giraffe            Fecal        US, Ohio                                  EF424623.1
BCoV/AH187-E/OH/2000      2000           Feedlot calf       Fecal        US, Ohio                                  EF424619.1
ACoV/OH/98                1998           Alpaca             Fecal        US, Oregon                                DQ915164.2
BCoV/LUN/US/98            1998           Cattle             Nasal        US, Texas                                 AF391542
BCoV/DB2/84               1984           Cattle             -            US, MD                                    DQ811784
BCoV/F15/FRA/79           1979           -                  Fecal        France                                    D00731
BCoV/LY138/US/65          1965           Cattle             Fecal        US, Utah                                  AF058942
BCoV/Mebus/US/72          1972           Cattle             -            US                                        U00735
BCoV/Quebec/72            1972           Cattle             -            Canada                                    AF220295
HCoV-OC43/VR759/UK/67     1967           Human              Nasal        England                                   AY391777
PHEV/VW572/BEL/72         1972           Pig                Tonsil       Belgium                                   DQ011855.1
PHEV/67N/BEL/70           1970           Piglet             -            Canada                                    AY078417
HECV-4408/US/94           1994           Human infant       Fecal        US, Louisiana                             L07748.1
BCoV/438/06-2/ITA         2006           Feedlot calf       Nasal        Italy                                     EU814647.1
BCoV/Kakegawa/JAP         1976           Cattle             Fecal        Japan                                     AB354579
HCoV-OC43/BE03/BEL        2003           Human infant       Nasal        Belgium                                   AY903454
HCoV-OC43/BE04/BEL        2004           Human infant       Nasal        Belgium                                   AY903455
HCoV-OC43/Paris/01        2001           Adult human        Nasal        France                                    AY585229
CRCoV/02/005/JAP          2002           Puppy              Nasal        Japan                                     AB242262.1
CRCoV/JAP/07              2007           Dog                Nasal        Japan                                     AB370269.1
CRCoV-4182/UK/03          2003           Puppy              Nasal        England                                   DQ682406
CRCoV/240/05/ITA          2005           Dog                Nasal        Italy                                     EU999954
ECoV/Tokachi/09/JAP       2009           Horse              Fecal        Japan                                     BAJ52885.1
ECoV/NC/99/US             1999           Foal               Fecal        US, North Carolina                        EF446615.1

* Samples were collected during the warm season.
† Strains were partially sequenced previously and their fragments A and B are available in databases. Other
fragments of these strains were sequenced in this study.
‡ The label names of strains partially sequenced in our previous studies (Bidokhti et al., 2012; Liu et al.,
2006) are designated here.
§ Samples were collected from the same farm in November 2005.
‖ Samples were collected from the same farm in March 2007.

Table 2. Primer pairs used to amplify and sequence the S gene in this study, with their references.

Primer name   Primer sequence (5'->3')                    Primer location   Primer reference
AF†           5'-ATG TTT TTG ATA CTT TTA ATT-3'           1-21              (Hasoksuz et al., 2002)
AR*           5'-AGT ACC ACC TTC TTG ATA AA-3'            654-635           (Hasoksuz et al., 2002)
BF            5'-ATG GCA TTG GGA TAC AG-3'                549-565           (Hasoksuz et al., 2002)
BR            5'-TAA TGG AGA GGG CAC CGA CTT-3'           1039-1018         (Hasoksuz et al., 2002)
CF            5'-GGG TTA CAC CTC TCA CTT CT-3'            782-801           (Hasoksuz et al., 2002)
CR            5'-GCA GGA CAA GTG CCT ATA CC-3'            1550-1531         (Hasoksuz et al., 2002)
DF            5'-GTC CGT GTA AAT TGG ATG GG-3'            1460-1479         (Hasoksuz et al., 2002)
DR            5'-TGT AGA GTA ATC CAC ACA GT-3'            2286-2267         (Hasoksuz et al., 2002)
EF            5'-GAA CCA GCA TTG CTA TTT CGG A-3'         2109-2131         This study
ER            5'-TTA TAA CTT TGC ACA CAA ATG AGG TC-3'    2876-2851         This study
GF            5'-CCC TGT ATT AGG TTG TTT AG-3'            2691-2710         (Jeong et al., 2005)
GR            5'-ACC ACT ACC AGT GAA CAT CC-3'            3606-3587         (Jeong et al., 2005)
HF            5'-GTG CAG AAT GCT CCA TAT GGT-3'           3439-3459         (Jeong et al., 2005)
HR            5'-TTA GTC GTC ATG TGA TGT TT-3'            4092-4073         (Jeong et al., 2005)

† F: Forward primer.
* R: Reverse primer.

Table 3. Mean estimates of the rate of evolution and tMRCA of the Swedish and Danish BCoV
strains in comparison with the reference strains in Betacoronavirus 1.

                          BCoV strains
Reference strains         Mean rate of evolution,                tMRCA
                          substitutions/site/year (×10⁻⁴)
Human (HEC-4408)          8.3 (6.7-9.9)*                         1977 (1975-1980)*
BCoV reference strains    8.7 (7.0-10.5)                         1978 (1974-1981)
Wild ruminants            4.4 (3.2-5.7)                          1963 (1954-1970)
Canine (CRCoV)            4.4 (3.2-5.5)                          1951 (1939-1961)
Human (HCoV-OC43)         4.1 (3.2-4.7)                          1899 (1884-1915)
Porcine (PHEV)            7.6 (6.0-9.3)                          1847 (1815-1875)
Equine                    7.9 (6.2-9.9)                          1797 (1752-1844)

* 95% confidence interval (CI) values are in brackets.

Table 4. REL analysis results for the S protein of the sequenced BCoV strains.

No. of sequences                    30
Mean dN-dS†                         2.04
No. of positively selected sites    39
Posterior probability               >90%
Positively selected sites*          35, 112, 113, 115, 143, 147, 151, 157, 188, 257, 447, 458, 471,
                                    482, 499, 501, 503, 510, 523, 525, 543, 546, 573, 578, 590, 596,
                                    718, 722, 805, 828, 881, 883, 888, 1034, 1120, 1206, 1237, 1239,
                                    1278

Three identical sequences were excluded from analysis.
† Because dS could be 0 for some sites, Datamonkey reports dN-dS in place of dN/dS.
* Positively selected sites identified with posterior probability p > 95% are in boldface. The
underlined ones are also reported by SNAP.

Fig. 1. [Tree image; horizontal scale from 72 to 100%.]

[Plot image; horizontal axis: amino acid position.]
Fig. 3. [Structure images; panel labels: DEN/03-3, SWE/M/10-1, GER/V270/83. Panel (g), SWE/C/92: positive selection sites (green and red), cleavage site (yellow), negative selection sites (cyan), S2.]